Application of Refined LSA and MD5 Algorithms in Spam Filtering

Authors

  • Jingtao Sun
  • Qiuyu Zhang
  • Zhanting Yuan
Abstract

The paper proposes a spam filtering method that combines a refined Latent Semantic Analysis (LSA) with the Message-Digest Algorithm 5 (MD5) to address common problems in spam filtering, including markedly reduced filtering precision and unbalanced filtering efficiency, both of which arise when the latent semantics of mail content are not analyzed. The LSA weighting function is improved by integrating fuzzy membership, which makes LSA more effective at processing mail content. In addition, MD5 is used to generate an "e-mail fingerprint" for each message, enabling quick matching and efficient, accurate handling of mass-mailed spam. Simulation results support the effectiveness of the method.
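
The fingerprint-matching idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the normalization step (lowercasing and collapsing whitespace) and the names `email_fingerprint` and `FingerprintFilter` are assumptions made for illustration, and the refined LSA stage is not shown.

```python
import hashlib
import re


def email_fingerprint(body: str) -> str:
    """Return the MD5 hex digest of a normalized message body.

    Normalization here is an assumption, not taken from the paper:
    lowercase the text and collapse runs of whitespace so that trivial
    formatting differences do not change the digest.
    """
    normalized = re.sub(r"\s+", " ", body.strip().lower())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


class FingerprintFilter:
    """Hypothetical store of fingerprints of messages already judged spam."""

    def __init__(self) -> None:
        self._known = set()

    def add_spam(self, body: str) -> None:
        # Record the fingerprint of a message classified as spam.
        self._known.add(email_fingerprint(body))

    def is_known_spam(self, body: str) -> bool:
        # Set membership is a constant-time lookup on average, which is
        # what makes matching fast for mass mailings of the same text.
        return email_fingerprint(body) in self._known


spam_filter = FingerprintFilter()
spam_filter.add_spam("Buy   CHEAP pills\nnow!")
print(spam_filter.is_known_spam("buy cheap pills now!"))
print(spam_filter.is_known_spam("Meeting moved to noon"))
```

An exact digest only matches messages that are identical after normalization, so a scheme like this would presumably rely on the content-based (LSA) stage for first-time or altered messages.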

Similar resources

Non-Parametric Spam Filtering based on kNN and LSA

The paper proposes a non-parametric approach to filtering unsolicited commercial e-mail messages, also known as spam. The text of each email message is represented as an LSA vector, which is then fed into a kNN classifier. The method shows high accuracy on a collection of recent personal email messages. Tests on the standard LINGSPAM collection achieve an accuracy of over 99.65%, which is an impr...

Content-Based Spam Filtering on Video Sharing Social Networks

In this work we are concerned with the detection of spam in video sharing social networks. Specifically, we investigate how much visual content-based analysis can aid in detecting spam in videos. This is a very challenging task, because of the high-level semantic concepts involved; of the assorted nature of social networks, preventing the use of constrained a priori information; and, what is pa...

Filtering Image Spam with Near-Duplicate Detection

A new trend in email spam is the emergence of image spam. Although current anti-spam technologies are quite successful in filtering text-based spam emails, the new image spams are substantially more difficult to detect, as they employ a variety of image creation and randomization algorithms. Spam image creation algorithms are designed to defeat well-known vision algorithms such as optical chara...

A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization

Spam is unwanted email that is harmful to communications around the world. Spam is a growing problem for personal email, so detecting it is essential. Machine learning is well suited to this problem: because it is adaptive, it shows good results in learning the patterns required for classification. Nonetheless, in spam detection, there are a large num...

A Novel Method for Detecting Spam Email using KNN Classification with Spearman Correlation as Distance Measure

E-mail is one of the most prevalent methods of correspondence because of its availability, quick message exchange and low sending cost. Spam mail is a serious issue affecting this application on today's internet. Spam may contain suspicious URLs, or may ask for financial information such as money exchange information or credit card details. Here comes the scope of filtering spam from legitimate em...

Journal title:
  • JCP

Volume 4, Issue

Pages -

Publication date 2009